(54) Method and apparatus for encoding or decoding audio or video frame data



(57) For broadcasting purposes a multi-channel audio encoder board has been designed. A requirement
for such encoders is that they are able to operate with different encoding parameters. Encoding
parameters may change during encoding operation. In order to avoid the output of invalid data, the
encoding parameters required for a specific processing path are added to the input streams for the
audio channels, become linked with the associated audio data and are stored in the various buffers
together with that audio data, i.e. the corresponding encoding parameters are kept linked with the
audio data to be encoded throughout the encoding processing.
Description

[0001] The invention relates to a method and to an apparatus for encoding or decoding audio or video frame data. 

Background

[0002] For broadcasting purposes a 4-stereo-channel MPEG audio encoder board has been designed. A requirement
for such encoders is that they are able to operate with different encoding parameters. MPEG allows, e.g., various
sample frequencies and overall data rates.


Invention 

[0003] A problem arises when one or more parameters change during normal encoding operation. This may happen
when the current type of program changes, e.g. from pure speech or news to music.

Normally the audio frames are processed in an encoder in subsequent different stages, for example conversion to
frequency coefficients in a first stage and bit allocation and quantisation in a further stage. In a path in parallel to the first
stage the psychoacoustic masking is calculated. A video encoder includes the following stages: block difference stage,
DCT (discrete cosine transform), quantisation and, in the feedback loop, inverse quantisation, inverse DCT and motion
compensated interpolation, the output of which is input to the block difference stage. The output of the quantisation
is possibly VLC (variable length coding) encoded and buffered before final output, and the buffer filling level is used to
control the quantisation in such a way that encoding artefacts are masked as far as possible.

[0004] If the encoding parameters were to change at a time instant at which a certain audio frame has been
processed in such first stage but not yet in such further stage, the data of this frame would become useless after having
been processed in the further stage with the changed encoding parameters.
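
The hazard described in paragraph [0004] can be illustrated with a deliberately small sketch. It assumes a hypothetical two-stage pipeline whose stages read one global parameter; the names `g_scale`, `stage1`, `stage2` and `frame_is_consistent` are illustrative only, and the arithmetic merely stands in for real encoding stages.

```c
#include <assert.h>

/* Hypothetical global parameter shared by all stages (the problematic design). */
static int g_scale = 2;

/* Stand-in for the first stage, e.g. conversion to frequency coefficients. */
static int stage1(int sample) { return sample * g_scale; }

/* Stand-in for the further stage; it assumes the same parameter as stage1. */
static int stage2(int value)
{
    assert(g_scale != 0);
    return value / g_scale;
}

/* Returns 1 if the frame passes both stages consistently, 0 otherwise.
   change_between != 0 simulates a parameter switch between the two stages. */
static int frame_is_consistent(int sample, int change_between)
{
    g_scale = 2;
    int mid = stage1(sample);
    if (change_between)
        g_scale = 4;            /* global switch while the frame is in flight */
    return stage2(mid) == sample;
}
```

With an unchanged parameter the frame survives both stages; with a switch between the stages the second stage works on data prepared for the old parameter, which is the invalid output the text refers to.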

[0005] In order to avoid such a problem, a big table with the old encoding parameters and a big table with the new
encoding parameters could be stored in the encoder for some time for each channel. The 'depth' of the tables
would depend on the number of streams within the encoder, and the tables would require repeated updating. All processing
stages of the encoder would need to have access to both channel tables and would need to determine at which time to
access which of the tables. In particular in a multichannel encoder in which different channels may change different
encoding parameters at different times, the channels possibly being assigned to different microprocessors, this solution
could easily produce errors. The tables would also require more memory capacity than the solution described below.
[0006] It is one object of the invention to disclose a method for encoding or decoding audio or video frame data for
which encoding or decoding parameters are required. This object is achieved by the method disclosed in claim 1.
[0007] It is a further object of the invention to disclose an apparatus which utilises the inventive method. This object
is achieved by the apparatus disclosed in claim 5.

[0008] In the invention, intermediately stored general parameter tables are not used. Instead, the encoding parameters
required for a specific processing path are added to the input streams for the audio channels, become linked
with the associated audio data and are stored in the various buffers together with that audio data, i.e. the corresponding
encoding parameters are kept linked with the audio data to be encoded throughout the encoding processing in the
different data streams and data paths. Preferably the original encoding parameters assigned to the processing paths
are converted to a different format in order to minimise the required word length and/or to facilitate easy evaluation
in the related processing stages.

Thereby each data stream can be processed with the correct parameter set without waiting for the encoding of the
old data stream to finish and for a reset and loading of new parameters before starting encoding of a new data stream
with new parameters.
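
A minimal sketch of the linking idea in paragraph [0008], under stated assumptions: the type `LinkedFrame`, the single integer parameter `scale` and the functions `stage1`, `stage2` and `run_interleaved` are hypothetical, and the arithmetic only stands in for real encoding stages. The point shown is that every stage reads the parameter from the frame it is processing.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_LEN 4              /* tiny frame for illustration only */

/* Hypothetical data field: one frame plus the parameter it must be coded with. */
typedef struct {
    int scale;                   /* stand-in for a real encoding parameter */
    int data[FRAME_LEN];
} LinkedFrame;

/* First stage: reads the parameter from the frame, never from a global. */
static void stage1(LinkedFrame *f)
{
    assert(f->scale != 0);
    for (size_t i = 0; i < FRAME_LEN; i++)
        f->data[i] *= f->scale;
}

/* Further stage: undoes stage1 using the same linked parameter. */
static void stage2(LinkedFrame *f)
{
    assert(f->scale != 0);
    for (size_t i = 0; i < FRAME_LEN; i++)
        f->data[i] /= f->scale;
}

/* Two frames with different parameters are in flight at the same time. */
static void run_interleaved(LinkedFrame *older, LinkedFrame *newer)
{
    stage1(older);
    stage1(newer);               /* the parameter switch lies between the frames */
    stage2(older);
    stage2(newer);
}
```

Because each frame carries its own parameter, the older frame is finished with the old value while the newer frame already uses the new one, and no reset is needed between them.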

[0009] The invention can also be used in audio or video decoders with a corresponding inverse order of processing
stages.

[0010] In principle, the inventive method is suited for encoding audio or video frame data for which encoding parameters
are required, wherein the required encoding parameters become linked at the input of the processing with frames
of said audio or video data to be encoded and throughout different stages in the encoding processing, and wherein in
each of these stages the corresponding encoding parameters linked with current frame data to be processed are
regarded in order to allow switching of the encoding parameters for any frame thereby avoiding encoding of invalid
output data without reset,
or
for decoding audio or video frame data for which decoding parameters are required, wherein the required decoding
parameters become linked at the input of the processing with frames of said audio or video data to be decoded and
throughout different stages in the decoding processing, and wherein in each of these stages the corresponding
decoding parameters linked with current frame data to be processed are regarded in order to allow switching of the decoding
parameters for any frame thereby avoiding decoding of invalid output data without reset.

[0011] Advantageous additional embodiments of the inventive method are disclosed in the respective dependent
claims. 

[0012] In principle the inventive apparatus is suited for encoding audio or video frame data for which encoding
parameters are required, and includes:

means for linking the required encoding parameters with frames of said audio or video data, said linking means 
arranged near the input of the apparatus; 

means for converting time domain samples into frequency domain coefficients, to the input of which means buffer
means are assigned;

means for calculating masking properties from said time domain samples, to the input of which means buffer 
means are assigned; 

means for performing bit allocation and quantisation of the coefficients under the control of the output of said masking
calculating means, to the input of which bit allocation and quantisation means buffer means are assigned,
wherein in said conversion means, in said masking calculating means and in said bit allocation and quantisation
means the corresponding encoding parameters linked with current frame data to be processed are regarded in
order to allow switching of the encoding parameters for any frame thereby avoiding encoding of invalid output data
without reset.

[0013] Advantageous additional embodiments of the inventive apparatus are disclosed in the respective dependent
claims. 

Drawings 

[0014] Embodiments of the invention are described with reference to the accompanying drawings, which show in:

Fig. 1 functional block diagram of a 4-channel audio encoder; 

Fig. 2 linked data field including audio data to be encoded and associated encoding parameters;

Fig. 3 microprocessor with memory including linked data fields.



Exemplary embodiments 



[0015] The audio encoder in Fig. 1 receives four stereo PCM input signals PCMA, PCMB, PCMC and PCMD.
MPEG audio data, for example, are frame based, each frame containing 1152 mono or stereo samples. The encoder operating
system of Fig. 1 may include six DSPs (Digital Signal Processors, not depicted) for the encoding of the four MPEG channels.
These DSPs form a software encoder which includes the technical functions depicted in Fig. 1. A suitable type of DSP
is for example ADSP 21060 or 21061 or 21062 of Analog Devices. As an alternative, the technical functions depicted
in Fig. 1 can be realised in hardware.

Synchronisation of the software running on the six DSPs, or on corresponding hardware, is achieved using FIFO buffers
wherein each buffer is assigned to one or some specific frames. This means that at a certain time instant a current
frame as well as previous frames, the number of which depends on the quantity of available buffers, are present in
the processing stages.

A global parameter switching would cause assignment of the new parameters also to such buffers which still contain
data to be processed with the previous set of parameters. This would make the content of such buffers useless. In the
invention, however, various encoding parameters like coding mode (mono, stereo, dual, joint stereo), sample rate and
data rate can be changed 'on the fly' without reset and without producing invalid encoder output data.
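
As a rough sketch of how frames with different parameter sets can coexist in a FIFO, the following assumes a hypothetical ring buffer in which every slot carries its own parameters. The names `FrameSlot`, `FrameFifo`, `fifo_push` and `fifo_pop`, the depth of 4 and the single `sample_rate_hz` field are illustrative choices, not taken from the encoder board described here.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_DEPTH 4             /* hypothetical number of frame buffers */

/* One buffered frame together with the parameter it was written with. */
typedef struct {
    int sample_rate_hz;          /* stand-in for the linked parameter set */
    int frame_no;
} FrameSlot;

typedef struct {
    FrameSlot slots[FIFO_DEPTH];
    size_t head;                 /* index of the oldest frame */
    size_t count;                /* number of frames currently stored */
} FrameFifo;

/* Appends a frame; returns 0 if the FIFO is full, 1 on success. */
static int fifo_push(FrameFifo *q, FrameSlot s)
{
    assert(q != NULL);
    if (q->count == FIFO_DEPTH)
        return 0;
    q->slots[(q->head + q->count) % FIFO_DEPTH] = s;
    q->count++;
    return 1;
}

/* Removes the oldest frame; returns 0 if the FIFO is empty, 1 on success. */
static int fifo_pop(FrameFifo *q, FrameSlot *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % FIFO_DEPTH;
    q->count--;
    return 1;
}
```

A frame written at 48 kHz and a later frame written at 44.1 kHz can sit in the same FIFO; the reading stage takes the parameter from whichever slot it pops, so a switch affects only frames written after it.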
Between some of the stages asynchronous buffers ASBUF are inserted which allow asynchronous write and read
operations. Between other stages synchronous buffers BUF are sufficient. The PCM input signals PCMA, PCMB, PCMC
and PCMD each pass via an asynchronous buffer to a respective converter CONA, CONB, CONC and COND. In such
converter an integer-to-floating representation conversion of the audio samples to be encoded may take place. It is also
possible that the encoder processes integer representation audio samples.

In such converter also one or more kinds of energy levels in a frame may be calculated, e.g. energy of all samples of
the frame or average energy of the samples of a frame. These energy values may be used in the subsequent
psychoacoustic processing.
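
The conversion and energy calculation described above can be sketched as follows. The sketch assumes 16-bit signed PCM input and a scaling to the range [-1, 1), neither of which is stated in the text; the function names `pcm16_to_float`, `frame_energy` and `frame_mean_energy` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Integer-to-floating conversion, assuming 16-bit signed PCM samples. */
static void pcm16_to_float(const int16_t *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (float)in[i] / 32768.0f;
}

/* Energy of all samples of the frame: sum of squared sample values. */
static float frame_energy(const float *x, size_t n)
{
    float e = 0.0f;
    for (size_t i = 0; i < n; i++)
        e += x[i] * x[i];
    return e;
}

/* Average energy per sample of the frame. */
static float frame_mean_energy(const float *x, size_t n)
{
    assert(n > 0);
    return frame_energy(x, n) / (float)n;
}
```

Both values could be stored alongside the frame so that the later psychoacoustic stages do not have to recompute them.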

In addition, in such converter the possibly adapted encoding parameters become linked with the frame audio data. In
respective parameter encoders PENCA, PENCB, PENCC and PENCD the original encoding parameters may be
converted as described above and then fed to CONA, CONB, CONC and COND, respectively. In an MPEG decoder the
decoding parameters in the transmitted datastream may be adapted correspondingly according to the hardware or
software requirements in the decoder before being (re-)linked to each data frame.
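
One way a parameter encoder could reduce the word length is to pack several small parameters into a single byte. The layout below (2 bits mode, 2 bits sample rate index, 4 bits bit rate index) and the names `CodingMode`, `pack_params` and the `unpack_*` helpers are assumptions made for illustration; the text does not specify the converted format.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative numbering of the coding modes named in the text. */
typedef enum {
    MODE_STEREO = 0,
    MODE_JOINT_STEREO = 1,
    MODE_DUAL = 2,
    MODE_MONO = 3
} CodingMode;

/* Packs the parameters into one byte: mode in bits 7..6,
   sample rate index in bits 5..4, bit rate index in bits 3..0. */
static uint8_t pack_params(CodingMode mode, unsigned sr_index, unsigned br_index)
{
    assert(sr_index < 4 && br_index < 16);
    return (uint8_t)(((unsigned)mode << 6) | (sr_index << 4) | br_index);
}

static CodingMode unpack_mode(uint8_t p)     { return (CodingMode)(p >> 6); }
static unsigned   unpack_sr_index(uint8_t p) { return (p >> 4) & 0x3u; }
static unsigned   unpack_br_index(uint8_t p) { return p & 0xFu; }
```

A processing stage then needs only shifts and masks to evaluate the parameters of the frame it is working on, which matches the stated aim of easy evaluation in the related stages.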

[0016] Via asynchronous buffers the output data of CONA, CONB, CONC and COND are fed in parallel to subband
filters SUBA, SUBB, SUBC and SUBD and to first left and right channel psychoacoustic calculators Psycho1A_L,
Psycho1A_R, Psycho1B_L, Psycho1B_R, Psycho1C_L, Psycho1C_R, Psycho1D_L and Psycho1D_R, respectively.
The subband filters divide the total audio spectrum into frequency bands, possibly using FFT, and may calculate the
maximum or scale factor of the coefficients in a frequency band or subband. Within the frequency bands a normalisation
may be carried out. The subband filters take into account the relevant encoding parameters read from the
corresponding upstream asynchronous buffer.
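
The maximum calculation and normalisation within one subband might look like the sketch below. It assumes that the scale factor is simply the peak magnitude of the band, which is a simplification; the names `band_peak` and `normalise_band` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the largest absolute coefficient value in one subband. */
static float band_peak(const float *c, size_t n)
{
    float peak = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = c[i] < 0.0f ? -c[i] : c[i];
        if (a > peak)
            peak = a;
    }
    return peak;
}

/* Divides every coefficient of the band by the peak so that the values
   lie in [-1, 1]. Returns the peak, which plays the role of a scale
   factor in this sketch. An all-zero band is left untouched. */
static float normalise_band(float *c, size_t n)
{
    assert(c != NULL);
    float peak = band_peak(c, n);
    if (peak > 0.0f)
        for (size_t i = 0; i < n; i++)
            c[i] /= peak;
    return peak;
}
```

The returned peak is the kind of per-band value that the second psychoacoustic calculators are said to evaluate later in the chain.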

The first psychoacoustic calculators perform an FFT having a length of e.g. 1024 samples and determine the current
masking information. Each first psychoacoustic calculator can be followed by a second psychoacoustic calculator
Psycho2A_L, Psycho2A_R, Psycho2B_L, Psycho2B_R, Psycho2C_L, Psycho2C_R, Psycho2D_L and Psycho2D_R,
respectively, which evaluates the maximum or scale factor values previously calculated in the subband filters. The first
and second psychoacoustic calculators take into account the relevant encoding parameters read from the
corresponding upstream asynchronous buffers. The output signals of Psycho2A_L, Psycho2A_R, Psycho2B_L, Psycho2B_R,
Psycho2C_L, Psycho2C_R, Psycho2D_L and Psycho2D_R are used in bit allocators and quantisers Bal/Q/E_A,
Bal/Q/E_B, Bal/Q/E_C and Bal/Q/E_D, respectively, for determining the number of bits allocated and the quantisation
of the audio data coefficients coming from the associated subband filter via a buffer. It is also possible to calculate in the
second psychoacoustic calculators in addition what is being calculated in the first psychoacoustic calculators and
thereby to omit the first psychoacoustic calculators. Finally, the outputs of Bal/Q/E_A, Bal/Q/E_B, Bal/Q/E_C and
Bal/Q/E_D pass through asynchronous buffers and output interfaces AES-EBU_A, AES-EBU_B, AES-EBU_C,
AES-EBU_D, respectively, which deliver the encoder stereo output signals PCM_Out_A, PCM_Out_B, PCM_Out_C,
PCM_Out_D, respectively.
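
How masking information can steer bit allocation is sketched below with a generic greedy scheme. This is not the allocator of the described encoder: it assumes that the psychoacoustic stage delivers one signal-to-mask ratio in dB per subband and that each allocated bit lowers the quantisation noise by roughly 6 dB. The names `allocate_bits` and `MAX_BITS_PER_BAND` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BITS_PER_BAND 15     /* hypothetical upper limit per subband */

/* Greedy allocation: repeatedly give one bit to the subband whose
   quantisation noise exceeds its masking threshold by the largest margin.
   smr_db[i] is the signal-to-mask ratio of band i in dB; budget is the
   number of bits available for the frame. */
static void allocate_bits(const float *smr_db, int *bits, size_t nbands, int budget)
{
    assert(budget >= 0);
    for (size_t i = 0; i < nbands; i++)
        bits[i] = 0;

    while (budget > 0) {
        size_t best = nbands;    /* nbands means "no band needs bits" */
        float worst = 0.0f;      /* only audible noise (> 0 dB) counts */
        for (size_t i = 0; i < nbands; i++) {
            /* Remaining noise-to-mask ratio, assuming 6 dB gain per bit. */
            float nmr = smr_db[i] - 6.0f * (float)bits[i];
            if (bits[i] < MAX_BITS_PER_BAND && nmr > worst) {
                worst = nmr;
                best = i;
            }
        }
        if (best == nbands)
            break;               /* all noise is masked; stop early */
        bits[best]++;
        budget--;
    }
}
```

In a frame-linked design the bit budget itself would be derived from the data rate parameter stored with the frame, so a data rate switch changes the budget only for frames that carry the new value.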

[0017] Fig. 2 shows a data field including audio samples or audio coefficients COE for a frame. To these samples
or coefficients encoding or decoding parameters PAR are linked or assigned. PAR includes for instance mode information
(mono, stereo, dual, joint stereo), sample rate and data rate information, length of the data field, type of MPEG
layer. An address pointer POI indicates the beginning of the parameter data PAR.

[0018] In Fig. 3 a microprocessor or DSP µP is shown together with its memory MEM. In the memory some data
fields A to F are depicted which correspond to data fields as shown in Fig. 2. E.g. data fields A, B and C may correspond
to data fields of three succeeding audio frames in one of the data paths of Fig. 1. Data field A may include other
encoding parameters PAR than data fields B and C. The address of the beginning of data field B can be calculated by adding the
length of data field A to POI.
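
The address calculation can be sketched as follows, assuming that every data field starts with a small header that stores its own total length in bytes. The header layout `FieldHeader` and the function `next_field` are hypothetical; the text only states that the length of the data field is part of PAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical start of the parameter data PAR of one data field. */
typedef struct {
    uint32_t length;             /* total bytes of this data field, header included */
    uint32_t sample_rate_hz;     /* example of a linked encoding parameter */
} FieldHeader;

/* Given the pointer POI to the start of one data field, returns the
   start of the following data field: POI plus the stored length. */
static const uint8_t *next_field(const uint8_t *poi)
{
    FieldHeader h;
    memcpy(&h, poi, sizeof h);   /* avoids alignment assumptions */
    assert(h.length >= sizeof h);
    return poi + h.length;
}
```

Since each field stores its own length, fields with different parameter sets or frame sizes can follow one another in memory without a separate index table.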

[0019] The software running on µP can use the following example commands in C-language for constructing the
data fields according to Fig. 2:

typedef struct {

    int bitrate_index;          /* bit rate information of the frame */

    int sampling_frequency;     /* sample rate information of the frame */

} layer;

"struct" may also contain time stamp information.

#define FRAMESIZE 1152 /* 1152 is a decimal number */

EP 1 021 044 A1

typedef struct {

    layer info;                 /* encoding parameters linked with this frame */

    float PCMBuf[FRAMESIZE];    /* audio samples of the frame */
} FloatBuffer_L_Type;
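
A short usage sketch for the two type definitions: the block repeats them so that it is self-contained and adds a hypothetical helper `fill_frame`, which is not part of the original listing, to show how parameters and samples end up in one data field.

```c
#include <assert.h>
#include <stddef.h>

#define FRAMESIZE 1152

typedef struct {
    int bitrate_index;
    int sampling_frequency;
} layer;

typedef struct {
    layer info;                  /* parameters linked with this frame */
    float PCMBuf[FRAMESIZE];     /* audio samples of the frame */
} FloatBuffer_L_Type;

/* Hypothetical helper: copies the current parameters and up to FRAMESIZE
   samples into one data field and zero-pads the rest of the frame. */
static void fill_frame(FloatBuffer_L_Type *buf, const layer *params,
                       const float *samples, size_t n)
{
    assert(n <= FRAMESIZE);
    buf->info = *params;         /* link the parameters with the audio data */
    for (size_t i = 0; i < n; i++)
        buf->PCMBuf[i] = samples[i];
    for (size_t i = n; i < FRAMESIZE; i++)
        buf->PCMBuf[i] = 0.0f;
}
```

Every later stage that receives a `FloatBuffer_L_Type` can read `info` directly from the buffer it is processing instead of consulting a global table.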

[0020] The invention can be used e.g. for MPEG 1, 2 and 4 Audio encoding and decoding for MPEG layers 1, 2 or
3, Digital Video Broadcast DVB, for AC-3, MD and AAC processing, for DVD processing and Internet applications
concerning audio or video data encoding and decoding.

Claims 

1. Method for encoding audio or video frame data (PCMA, COE) for which encoding parameters (PAR) are required,
characterised in that the required encoding parameters become linked (CONA, CONB, CONC, COND) at the
input of the processing with frames of said audio or video data to be encoded and throughout different stages
(CONA, SUBA, Bal/Q/E_A) in the encoding processing, wherein in each of these stages the corresponding
encoding parameters linked with current frame data to be processed are regarded in order to allow switching of the
encoding parameters for any frame thereby avoiding encoding of invalid output data (PCM_OutA) without reset.

2. Method for decoding audio or video frame data (PCMA, COE) for which decoding parameters (PAR) are required,
characterised in that the required decoding parameters become linked (CONA, CONB, CONC, COND) at the
input of the processing with frames of said audio or video data to be decoded and throughout different stages in the
decoding processing, wherein in each of these stages the corresponding decoding parameters linked with current
frame data to be processed are regarded in order to allow switching of the decoding parameters for any frame
thereby avoiding decoding of invalid output data (PCM_OutA) without reset.

3. Method according to claim 1 or 2, wherein the encoding or decoding parameters become converted (PENCA,
PENCB, PENCC, PENCD) to a different format before being linked to the frames of said audio or video data to be
encoded or decoded, respectively.

4. Method according to any of claims 1 to 3, wherein to each stage an asynchronous buffer is assigned at its input and
wherein the asynchronous buffers contain data fields including audio or video frame data (COE) and the associated
encoding or decoding parameters (PAR), respectively.

5. Apparatus for encoding audio or video frame data (PCMA, COE) for which encoding parameters (PAR) are
required, and including:

- means for linking (CONA, CONB, CONC, COND) the required encoding parameters with frames of said audio
or video data, said linking means arranged near the input of the apparatus;

- means (SUBA, SUBB, SUBC, SUBD) for converting time domain samples into frequency domain coefficients,
to the input of which means buffer means are assigned;

- means (Psycho1A_L, Psycho1A_R, Psycho1B_L, Psycho1B_R, Psycho1C_L, Psycho1C_R, Psycho1D_L,
Psycho1D_R, Psycho2A_L, Psycho2A_R, Psycho2B_L, Psycho2B_R, Psycho2C_L, Psycho2C_R,
Psycho2D_L, Psycho2D_R) for calculating masking properties from said time domain samples, to the input of
which means buffer means are assigned;

- means (Bal/Q/E_A, Bal/Q/E_B, Bal/Q/E_C, Bal/Q/E_D) for performing bit allocation and quantisation of the
coefficients under the control of the output of said masking calculating means, to the input of which bit allocation
and quantisation means buffer means are assigned, wherein in said conversion means, in said masking
calculating means and in said bit allocation and quantisation means the corresponding encoding parameters
linked with current frame data to be processed are regarded in order to allow switching of the encoding
parameters for any frame thereby avoiding encoding of invalid output data (PCM_OutA) without reset.

6. Apparatus according to claim 5, wherein the encoding parameters become converted (PENCA, PENCB, PENCC,
PENCD) to a different format before being linked to the frames of said audio or video data to be encoded.

7. Apparatus according to claim 5 or 6, wherein said buffer means contain data fields including audio or video frame
data (COE) and the associated encoding parameters (PAR).